Towards an Open Service Architecture for Data Mining on the Grid

Authors

  • Peter Brezany
  • Jürgen Hofer
  • Alexander Wöhrer
  • A Min Tjoa
Abstract

Across a wide variety of fields, huge datasets are being collected and accumulated at a dramatic pace. The datasets addressed by individual applications are very often heterogeneous and geographically distributed, and they are used collaboratively by user communities that are often large and also geographically distributed. Major challenges arise in storing this mass of data efficiently and reliably, processing it quickly, and extracting descriptive and predictive knowledge from it. In this paper, we describe the design principles and the service-based software architecture of a novel infrastructure for distributed, high-performance data mining in Computational Grid environments. The architecture is being designed and implemented on top of the Globus 3.0 Alpha toolkit, which provides basic Grid services such as authentication, information, and resource management, and on top of OGSA-DAI Grid Services, which provide basic access to Grid databases.

Similar resources

WS-DAI-DM: An Interface Specification for Data Mining in Grid Environments

Providing appropriate means of access to data mining services in a Grid environment is essential for combining the Grid with data mining. The transition from the centralized data mining processes of traditional tools to Grid-compliant, Grid-based data mining services that can coordinate with each other is important for extracting useful and potential knowledge/patterns from distributed dat...

Biosimgrid: a Distributed Database for Biomolecular Simulations

Biomolecular simulations provide data on the conformational dynamics and energetics of complex biomolecular systems. We aim to exploit the e-science infrastructure developing in the UK to enable large scale analysis of the results of such simulations. In particular, the BioSimGrid project (www.biosimgrid.org) will provide a generic database for comparative analysis of simulations of biomolecule...

GridMiner: An Infrastructure for Data Mining on Computational Grids

Knowledge discovery in datasets integrated into Grids is a challenging research task. These large datasets are being collected and accumulated across a wide variety of fields at a dramatic pace. They are often heterogeneous and geographically distributed, and they are used globally by large user communities. There are major challenges involved in the efficient and reliable storage, fast processing, in...

Architectural Plan for Constructing Fault Tolerable Workflow Engines Based on Grid Service

In this paper, the design and implementation of a fault-tolerant architecture for scientific workflow engines is presented. The engines are assumed to be implemented as composite web services. Current architectures for workflow engines make no provision for substituting faulty web services with correct ones at run time. The difficulty is to roll back the execution state of the workflo...

Development of a framework to evaluate service-oriented architecture governance using COBIT approach

Organizations today need an effective governance framework for their service-oriented architecture (SOA) so that they can evaluate the current state of their governance, determine their governance requirements, and then adopt a suitable governance model. Various frameworks have been developed to evaluate SOA governance. In this paper, a brief introd...

Publication date: 2003